Building a Fingerprint Based Deduplication Detection and Elimination Scheme
Abstract
As digital data grows rapidly, data reduction has become an important task in storage systems. For large-scale data reduction, it is important to detect and eliminate as much redundancy as possible at low overhead. Data deduplication is a data reduction technique that saves storage space by eliminating redundant data so that only one instance of the data is retained on storage media. Delta compression is an efficient method for removing redundancy among non-duplicate but very similar files and chunks. DARE is a deduplication-aware, low-overhead resemblance detection and elimination scheme for data reduction that uses duplicate-adjacency (DupAdj) information for resemblance detection in a deduplication system. It also uses an improved super-feature approach for further resemblance detection when DupAdj information is lacking or limited. DARE attained superior performance in both throughput and data reduction efficiency among all resemblance detection approaches compared.
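The following Python sketch illustrates the two ideas named in the abstract: fingerprint-based duplicate elimination and the use of duplicate adjacency to pick delta-compression candidates. It is a minimal illustration under stated assumptions, not the DARE implementation: the fixed 8-byte chunk size, the SHA-1 fingerprint, the function names, and the rule "a non-duplicate chunk next to a duplicate is paired with the stored neighbor of that duplicate's match" are all assumptions made for this example. The super-feature fallback and the delta encoding itself are omitted.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical, tiny fixed chunk size chosen only for illustration


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Hash that identifies a chunk by its content (SHA-1 is an assumption)."""
    return hashlib.sha1(chunk).hexdigest()


def deduplicate(stream: bytes, index: dict, store: list):
    """Classify each chunk as duplicate, DupAdj delta candidate, or unique.

    index maps fingerprint -> position of that chunk in store.
    Returns a list of (kind, position) tuples, where position is the stored
    copy for duplicates and unique chunks, or the suggested base chunk for
    delta candidates.
    """
    parts = chunks(stream)
    fps = [fingerprint(c) for c in parts]
    old_len = len(store)                       # chunks stored before this stream
    matched = [index.get(fp) for fp in fps]    # matches against earlier data only

    recipe = []
    for i, (chunk, fp) in enumerate(zip(parts, fps)):
        if fp in index:
            # Exact duplicate: keep only a reference to the stored instance.
            recipe.append(("duplicate", index[fp]))
            continue

        # Duplicate adjacency: if a neighbor matched stored chunk p, the stored
        # chunk next to p is assumed to resemble this chunk.
        base = None
        if i > 0 and matched[i - 1] is not None and matched[i - 1] + 1 < old_len:
            base = matched[i - 1] + 1
        elif i + 1 < len(parts) and matched[i + 1] is not None and matched[i + 1] >= 1:
            base = matched[i + 1] - 1

        # This sketch stores the chunk in full; a real system would store a
        # delta against the base chunk instead.
        index[fp] = len(store)
        store.append(chunk)
        if base is not None:
            recipe.append(("delta_candidate", base))
        else:
            recipe.append(("unique", index[fp]))
    return recipe


index, store = {}, []
first = deduplicate(b"AAAAAAAABBBBBBBBCCCCCCCC", index, store)
second = deduplicate(b"AAAAAAAABBBBxBBBCCCCCCCC", index, store)
print(first)
print(second)
```

In the second call, the middle chunk differs from the stored one by a single byte, so its fingerprint does not match. Because both of its neighbors are duplicates, the sketch proposes the stored middle chunk as a delta base without computing any similarity features.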
Similar resources
Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage
In a virtualized cloud cluster, frequent snapshot backup of virtual disks improves hosting reliability; however, it takes significant memory resource to detect and remove duplicated content blocks among snapshots. This paper presents a low-cost deduplication solution scalable for a large number of virtual machines. The key idea is to separate duplicate detection from the actual storage backup i...
A Robust Structural Fingerprint Restoration
Fast and accurate ridge detection in fingerprints is essential to each AFIS (Automatic Fingerprint Identification System). Smudged furrows and cut ridges in the image of a fingerprint are major problems in any AFIS. This paper investigates a new online ridge detection method that reduces the complexity and costs associated with the fingerprint identification procedure. The noise in fingerprint...
A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique
Data deduplication describes an approach that reduces the storage capacity needed to store data or the amount of data that has to be transferred over the network. Cloud storage has received increasing attention from industry as it offers infinite storage resources that are available on demand. Source deduplication is useful in cloud backup because it saves network bandwidth and reduces network space. Deduplication is the pr...
Cryptographic Hashing Method using for Secure and Similarity Detection in Distributed Cloud Data
Received Jun 29, 2017; revised Nov 23, 2017; accepted Dec 17, 2017. The explosive increase of data brings new challenges to data storage and supervision in cloud settings. These data typically have to be processed in an appropriate fashion in the cloud. Thus, any increased latency may cause an immense loss to enterprises. Duplication detection plays a major role in data management. Data...
Data Deduplication Report
The production of data is expanding at an astonishing pace. Data are exploding as companies and organizations collect and store increasing amounts of information. The huge amount of data requires more storage, processing power and network bandwidth. To address this problem, data deduplication is being widely used. Hashing is widely used in data deduplication systems. Because hashing has many adv...